11 research outputs found

    A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition

    We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.

    Bandpass phase shifter and analytic signal generator

    In this note, a novel tunable bandpass filter/phase shifter implementation (with a Hilbert transformer as a special case) is proposed. The filter can also be used to synthesize a bandpass analytic signal from a real-valued signal. The novelty is the simple yet elegant implementation, which exploits the even and odd symmetry of the in-phase and quadrature carrier modulation.
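As a point of reference for what an analytic signal is, the following minimal sketch builds one from a real sequence with the textbook DFT method (keep DC, double the positive frequencies, zero the negative ones). This is not the modulation-based implementation the note proposes, whose details are not given in the abstract; the function name `analytic_signal` and the test tone are illustrative assumptions.

```python
import cmath
import math


def analytic_signal(x):
    """Return the analytic signal of a real sequence using the DFT method.

    Assumption: this is the standard frequency-domain construction, used
    here only for illustration; it is not the note's filter structure.
    """
    n = len(x)
    # Forward DFT (O(n^2), acceptable for short illustrative inputs).
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double positive bins, zero negative bins.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            weight = 1.0
        elif k < (n + 1) // 2:
            weight = 2.0
        else:
            weight = 0.0
        spectrum[k] *= weight
    # Inverse DFT gives the complex-valued analytic signal.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# Hypothetical test tone: 4 cycles of a cosine over 32 samples.
N = 32
tone = [math.cos(2 * math.pi * 4 * t / N) for t in range(N)]
z = analytic_signal(tone)
```

For this tone the real part of `z` reproduces the input and the imaginary part is the corresponding sine, i.e. the input shifted by 90 degrees, which is the Hilbert-transformer special case the abstract mentions.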

    Synchrony capture filterbank: Auditory-inspired signal processing for tracking individual frequency components in speech

    A processing scheme for speech signals is proposed that emulates synchrony capture in the auditory nerve. The role of stimulus-locked spike timing is important for representation of stimulus periodicity, low frequency spectrum, and spatial location. In synchrony capture, dominant single frequency components in each frequency region impress their time structures on temporal firing patterns of auditory nerve fibers with nearby characteristic frequencies (CFs). At low frequencies, for voiced sounds, synchrony capture divides the nerve into discrete CF territories associated with individual harmonics. An adaptive, synchrony capture filterbank (SCFB) consisting of a fixed array of traditional, passive linear (gammatone) filters cascaded with a bank of adaptively tunable, bandpass filter triplets is proposed. Differences in triplet output envelopes steer triplet center frequencies via voltage controlled oscillators (VCOs). The SCFB exhibits some cochlea-like responses, such as two-tone suppression and distortion products, and possesses many desirable properties for processing speech, music, and natural sounds. Strong signal components dominate relatively greater numbers of filter channels, thereby yielding robust encodings of relative component intensities. The VCOs precisely lock onto harmonics most important for formant tracking, pitch perception, and sound separation. © 2013 Acoustical Society of America

    Synchrony capture filterbank (SCFB): An auditory periphery inspired method for tracking sinusoids

    We propose a novel algorithm for tracking multiple sinusoidal signals that is motivated by neural coding in the mammalian peripheral auditory system. A striking feature of auditory nerve activity is the phenomenon of synchrony capture, whereby the most intense frequency components in the stimulus dominate the temporal firing patterns of whole subpopulations of auditory nerve fibers (ANFs). A novel adaptive filterbank structure that emulates key aspects of synchrony capture is presented. The proposed filterbank has two components: a fixed bank of traditional gammatone (or equivalent) filters that are cascaded with a bank of adaptively-tunable bandpass filter triplets. The bandpass filters are tuned by using a voltage controlled oscillator (VCO) whose frequency is steered by a frequency discriminator loop (FDL). The resulting filterbank is used to process synthetic signals and speech. It is shown that the VCOs can track the low frequency harmonics in speech that evoke voice pitch at their fundamental (F0). For vowels, the VCOs faithfully track the strongest harmonic present in each formant region. © 2012 IEEE

    Multiple pitch identification using cochlear-like frequency capture and harmonic grouping

    This work addresses the problem of identifying multiple fundamental frequencies in an acoustic signal. An auditory-inspired peripheral signal processing model is proposed that functions more like a bank of FM receivers than a traditional filterbank. Such receivers lock on to a strong signal (synchrony capture, frequency capture) even in the presence of nearby, only slightly weaker signal components. Once the individual signal components are resolved, the model subjects them to an instantaneous nonlinearity and then performs harmonic grouping by cross-correlating the isolated components. After the harmonically related components are grouped, their pitches are computed using a standard summary autocorrelation approach. © 2011 IEEE
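The final stage described above, a summary autocorrelation over already-grouped components, can be sketched roughly as follows. This is a simplified illustration rather than the paper's model: it assumes the components have already been resolved into clean sinusoids, uses half-wave rectification as the instantaneous nonlinearity (the abstract does not say which nonlinearity is used), and the function name, sample rate, and lag range are all hypothetical choices.

```python
import math


def summary_autocorrelation_f0(freqs_hz, fs=8000, window=400, min_lag=20, max_lag=70):
    """Estimate f0 from resolved sinusoidal components (illustrative sketch).

    Each component is treated as one channel, half-wave rectified,
    autocorrelated over a fixed window, and the channel autocorrelations
    are summed. The lag with the largest summed value gives the period.
    """
    n_samples = window + max_lag
    summary = [0.0] * (max_lag + 1)
    for f in freqs_hz:
        # Assumed nonlinearity: half-wave rectification of the channel signal.
        channel = [max(0.0, math.sin(2 * math.pi * f * n / fs)) for n in range(n_samples)]
        for lag in range(min_lag, max_lag + 1):
            summary[lag] += sum(channel[n] * channel[n + lag] for n in range(window))
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: summary[lag])
    return fs / best_lag


# Hypothetical harmonic group: the first three harmonics of a 200 Hz fundamental.
f0 = summary_autocorrelation_f0([200.0, 400.0, 600.0])
```

Only at a lag of one fundamental period (40 samples at 8 kHz) do all three channel autocorrelations peak together, which is why summing across channels singles out the common fundamental.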

    Auditory-inspired pitch extraction using a Synchrony Capture Filterbank and phase alignment

    The question of how harmonic sounds produce strong, low pitches at their fundamental frequencies, f0s, has been of theoretical and practical interest to scientists and engineers for many decades. Currently the best auditory models for f0 pitch, e.g. [1], are based on bandpass filtering (cochlear mechanics), half-wave rectification and low-pass filtering (haircell transduction and synaptic transmission), channel autocorrelations (all-order interspike interval statistics) aggregated into a summary autocorrelation, and an analysis that determines the most prevalent interspike intervals. As a possible alternative to autocorrelation computations, we propose a model that uses an adaptive Synchrony Capture Filterbank (SCFB) in which groups of filters or channels in a filterbank neighborhood are driven exclusively (captured) by the dominant frequency components closest to them. The channel outputs are then adaptively phase aligned with respect to a common time reference to compute a Summary Phase Aligned Function (SPAF), aggregated across all channels, from which f0 can be easily extracted. © 2014 IEEE

    Application of ERT, Saline Tracer and Numerical Studies to Delineate Preferential Paths in Fractured Granites

    Accurate quantification of in situ heterogeneity and flow processes through fractured geologic media remains elusive for hydrogeologists due to the complexity of fracture characterization and its multiscale behavior. In this research, we demonstrated the efficacy of tracer-electrical resistivity tomography (ERT) experiments combined with numerical simulations to characterize heterogeneity and delineate preferential flow paths in a fractured granite aquifer. A series of natural gradient saline tracer experiments were conducted from a depth window of 18 to 22 m in an injection well (IW) located inside the Indian Institute of Technology Hyderabad campus. Tracer migration was monitored in a time-lapse mode using two cross-sectional surface ERT profiles placed in the direction of the flow gradient. ERT data quality was improved by considering stacking, reciprocal measurements, resolution indicators, and geophysical logs. Dynamic changes in subsurface electrical properties inferred via resistivity anomalies were used to highlight preferential flow paths of the study area. Temporal changes in electrical resistivity and tracer concentration were monitored along the vertical in an observation well located 48 m east of the IW. ERT-derived tracer breakthrough curves were in agreement with geochemical sample measurements. Fracture geometry and hydraulic properties derived from ERT and pumping tests were further used to evaluate two mathematical conceptualizations that are relevant to fractured aquifers. Results of the numerical analysis indicate that the dual continuum model, which couples the matrix and fracture systems through a flow exchange term, outperformed the equivalent continuum model in reproducing tracer concentrations at the monitoring wells (evidenced by a decrease in RMSE from 199 to 65 mg/L). A sensitivity analysis of the model simulations shows that spatial variability in hydraulic conductivity, local-scale dispersion, and flow exchange at the fracture-matrix interface have a profound effect on the simulated results.
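The model comparison above rests on the root-mean-square error (RMSE) between observed and simulated tracer concentrations. A minimal sketch of that metric is given below; the function name and the concentration values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between paired observations and simulations."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical tracer concentrations in mg/L (illustrative values only).
observed_mg_per_l = [100.0, 200.0, 300.0]
simulated_mg_per_l = [110.0, 190.0, 310.0]
error = rmse(observed_mg_per_l, simulated_mg_per_l)
```

Because RMSE carries the units of the measured quantity, the reported values of 199 and 65 mg/L can be read directly as typical concentration misfits for the two conceptualizations.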

    Collation of chewing efficiency and dentures with diverse occlusal schemes

    Background: Patient satisfaction with the rehabilitation of an edentulous mouth depends largely on the chewing ability provided by the prosthesis. Aim: To evaluate and compare the masticatory efficiencies of complete dentures with different occlusal schemes. Materials and Methods: Fourteen completely edentulous patients in the age group of 50-70 years were selected according to the inclusion criteria followed in this study. The dentures were made with three different occlusal schemes, i.e., anatomic occlusion without balancing, anatomic occlusion with balancing, and lingualized occlusion, and stored in water until the date of denture insertion. Post-insertion instructions were given to the patients at the time of delivery of the dentures. Patients were recalled after seven days, and the masticatory efficiency test was then performed using boiled peanuts and a sieve system. Statistical Analysis: One-way analysis of variance (ANOVA) and unpaired t-tests were carried out. Results: The masticatory efficiency values obtained with anatomic occlusion without balancing, anatomic occlusion with balancing, and lingualized occlusion (LO) were analyzed using the one-way ANOVA test and the unpaired t-test. The tests showed that the lingualized scheme had the highest masticatory efficiency. Conclusion: Within the scope of this study, it can be concluded that masticatory efficiency will generally be higher in patients provided with complete dentures fabricated using the lingualized occlusal scheme.
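For readers unfamiliar with the one-way ANOVA named in the statistical analysis, the sketch below computes its F statistic for three groups. The scores are invented for illustration and are not the study's measurements; the function name is likewise hypothetical, and the sketch treats the groups as independent samples without addressing whether that matches the study design.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical efficiency scores for three occlusal schemes (not study data).
scores = [
    [1.0, 2.0, 3.0],  # anatomic occlusion without balancing
    [2.0, 3.0, 4.0],  # anatomic occlusion with balancing
    [5.0, 6.0, 7.0],  # lingualized occlusion
]
f_statistic = one_way_anova_f(scores)
```

A large F value indicates that the group means differ by more than the within-group scatter would explain; pairwise t-tests such as those the abstract mentions are then used to locate which schemes differ.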